Exploiting Temporal Sequence Structure for Semantic Analysis of Multimedia
Authors
Abstract
Automatically deducing semantic labels for audiovisual data requires awareness of context, which in turn requires processing sequences of audiovisual scenes or events. How such sequences are represented therefore matters for semantic analysis tasks. Conventional approaches use sequences of specific short-duration event labels, often hand-annotated for training detectors or classifiers. In this paper we propose a new technique for audiovisual event categorization in which units of audio and image scenes are discovered automatically from data through a likelihood-maximization process. We show how these units, called AUDs for audio and VIDs for video, can be used to learn the salient characteristics of broad-category semantic labels without requiring explicit error recovery measures. Experiments on the MED-11 dataset show that AUDs and VIDs retrieve semantic categories from mixed-content data better than vector quantization-based systems and systems that use library-based descriptors.
Index Terms: multimedia analysis, semantic labels, unsupervised lexicon learning, audiovisual data retrieval
Similar resources
A Spatio-Temporal Semantic Model for Multimedia Database Systems and Multimedia Information Systems
As more information sources become available in multimedia systems, the development of abstract semantic models for video, audio, text, and image data becomes very important. An abstract semantic model has two requirements: It should be rich enough to provide a friendly interface of multimedia presentation synchronization schedules to the users and it should be a good programming data structur...
Dynamic Generation of Intelligent Multimedia Presentations through Semantic Inferencing
This paper first proposes a high-level architecture for semi-automatically generating multimedia presentations by combining semantic inferencing with multimedia presentation generation tools. It then describes a system, based on this architecture, which was developed as a service to run over OAI archives but is applicable to any repositories containing mixed-media resources described using Dubl...
Augmented Transition Networks as Semantic Models for Multimedia Presentations, Multimedia Database Searching, and Multimedia Browsing
As more information sources become available in multimedia systems, the development of abstract semantic models for video, audio, text, and image data becomes very important. An abstract semantic model has two requirements. First, it should be rich enough to provide a friendly interface of multimedia presentation synchronization schedules to the users. Second, it should be a good programming ...
Reverse Engineering of Network Software Binary Codes for Identification of Syntax and Semantics of Protocol Messages
Reverse engineering of network applications, especially from the security point of view, is of high importance and interest. Many network applications use proprietary protocols whose specifications are not publicly available. Reverse engineering of such applications could provide us with vital information to understand their embedded unknown protocols. This could facilitate many tasks including d...
Temporal Semantic Motion Segmentation Using Spatio Temporal Optimization
Segmenting moving objects in a video sequence has been a challenging problem and is critical to outdoor robotic navigation. While recent literature has focused on regularizing object labels over a sequence of frames, work exploiting spatio-temporal features for motion segmentation has been scarce. Particularly in real world dynamic scenes, existing approaches fail to exploit temporal consistency...